Celestin Apprentice 5

home *** CD-ROM | disk | FTP | other *** search

/ Celestin Apprentice 5 / Apprentice-Release5.iso / Environments / PowerMacOberon feb96 / Source / MathL.Mod (.txt) < prev next >

Wrap

Oberon Text | 1994-11-15 | 14.9 KB | 276 lines | [TEXT/.Ob4]

Syntax10.Scn.Fnt Syntax10b.Scn.Fnt InfoElems Alloc Syntax10.Scn.Fnt StampElems Alloc 15 Nov 94 "Title": NewMathL "Author": Christoph Steindl (CS) "Abstract": Reimplementation of the mathematical functions exported by MathL. arctan and sin are based on algorithms published via the PowerPCNews as part of the MathLibFast package, the rest (sqrt, ln and exp) are based on algorithms used and described in the VS Fortran Language and Library Reference. The previously used MathLib shipped with the Metrowerks C-Compiler was slow and inaccurate. "Keywords": trigonometric functions, exponential function, square root function, logarithmic function, IEEE floating-point format "Version": 1.0 "From": 11.11.94 15:40:26 "Until": "Changes": "Hints": This text can again contain arbitrary text elements! FoldElems Syntax10.Scn.Fnt Courier10.Scn.Fnt Floating point format according to the IEEE standard: single precision: S EEEEEEEE MMMMMMMMMMMMMMMMMMMMMMM 1 bit for the sign 8 bits for the exponent 23 bits for the mantissa ____________________________________________________________________________________________________ 32 bits = 4 bytes for one single precision floating point number The exponent is stored as an unbiased exponent, to get the real exponent (within range -126..127) you have to subtract 127 from the resulting number). The number 0 is represented as exponent = 0 and mantissa = 0. An exponent of 255 and a mantissa of 0 denotes infinity. An exponent of 255 and a mantissa of #0 denotes NaN. double precision: S EEEEEEEEEEE MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM 1 bit for the sign 11 bits for the exponent 52 bits for the mantissa ______________________________________________________________________________________________________ 64 bits = 8 bytes for one double precision floating point number The exponent is stored as an unbiased exponent, to get the real exponent (within range -1022..1024) you have to subtract 1023 from the resulting number). The number 0 is represented as exponent = 0 and mantissa = 0. An exponent of 2047 and a mantissa of 0 denotes infinity. An exponent of 2047 and a mantissa of #0 denotes NaN. Syntax10.Scn.Fnt Ln2 = 0.69314718055994530942D+00; FourLn2 = 4 * Ln2; piOver2 = pi / 2; piOver4 = pi / 4; twoPi = pi * 2; fourOverPi = 4 / pi; oneOverSqrt2 = 0.7071067811865475244D0; Syntax10.Scn.Fnt VAR sign, l: BOOLEAN; y: INTEGER; a, b, z: LONGREAL; BEGIN sign := x < 0; IF sign THEN x := -x END; l := FALSE; IF x >= 4 THEN l := TRUE; x := 1 / x; y := 0 ELSIF x < 0.25 THEN y := 0 ELSE y := SHORT(ENTIER(x / 0.5)); z := y * 0.5; x := (x - z) / (x * z + 1) END; z := x * x; b := ((((893025.0D+0 * z + 49116375.0D+0) * z + 425675250.0D+0) * z + 1277025750.0D+0) * z + 1550674125.0D+0) * z + 654729075.0D+0; a := (((13852575.0D+0 * z + 216602100.0D+0) * z + 891080190.0D+0) * z + 1332431100.0D+0) * z + 654729075.0D+0; a := (a / b) * x + ATanC[y]; IF l THEN a := piOver2 - a END; IF sign THEN RETURN -a ELSE RETURN a END END arctan; Syntax10.Scn.Fnt VAR neg: BOOLEAN; y, r, z: LONGREAL; BEGIN neg := x < 0; IF neg THEN x := -x END; IF x > twoPi THEN x := x - ENTIER(x / twoPi) * twoPi END; IF x > pi THEN x := x - pi; neg := ~neg END; IF x > piOver2 THEN x := pi - x END; IF x < piOver4 THEN y := x * fourOverPi; z := y * y; r := y * (((((((-0.202253129293D-13 * z + 0.69481520350522D-11) * z - 0.17572474176170806D-8) * z + 0.313361688917325348D-6) * z - 0.365762041821464001D-4) * z + 0.249039457019271628D-2) * z - 0.0807455121882807815D+0) * z + 0.785398163397448310D+0) ELSE y := (piOver2 - x) * fourOverPi; z := y * y; r := ((((((-0.38577620372D-12 * z + 0.11500497024263D-9) * z - 0.2461136382637005D-7) * z + 0.359086044588581953D-5) * z - 0.325991886926687550D-3) * z + 0.0158543442438154109D+0) * z - 0.308425137534042452D+0) * z + 1.0 END; IF neg THEN RETURN -r ELSE RETURN r END END sin; Syntax10.Scn.Fnt BEGIN IF x < 0 THEN x := -x END; IF x > twoPi THEN (* do range reduction here to limit roundoff on add of Pi/2 *) x := x - ENTIER(x / twoPi) * twoPi END; RETURN sin(x + piOver2) END cos; Syntax10.Scn.Fnt FoldElems Syntax10.Scn.Fnt 1. If x = 0, then the answer is 0. 2. Write x = 16^(2p-q) * m, where 2p-q is the exponent and q equals either 0 or 1; m is the mantissa and is within the range 1/16 <= m < 1. 3. Then, sqrt(x) = 16^p * 4^(-q) * sqrt(m) 4. For the first approximation of sqrt(x), compute the following: y0 = 16^p * 4^(1-q) * 0.2202 (m + 0.2587). The extrema of relative errors of this approximation for q = 0 are 2^-3.202 at m = 1.2^-3.265 at m = 0.2587, and 2^-2.925 at m=1/16. This approximation, rather than the minimax approximation, was chosen so that the quantity x/y3 - y3 below becomes less than 16^(p-8) in magnitude. This arrangement allows us to substitute short form counterparts of the long form instructions in the final iteration. 5. Apply the Newton Raphson iteration yn+1 = 1/2 (yn + x/yn) four times to y0, twice in the short form and twice in the long form. The final step is performed as y4 = y3 + 1/2 (x/y3 - y3) with an appropriate truncation maneuver to obtain a virtual rouding. The maximum relative error of the final result is theoretically 2^-63.23. VAR p, q, e, tmp: LONGINT; m, y0: LONGREAL; arr, arr2: ARRAY 8 OF SHORTINT; Algorithm, source: VS FORTRAN Language and Library Reference, Appendix D BEGIN IF x = 0 THEN RETURN 0 END; SYSTEM.MOVE(SYSTEM.ADR(x), SYSTEM.ADR(arr), 8); tmp := ASH(arr[1], -4); (* shift high-byte to low-byte *) e := ASH(arr[0], 4) + SYSTEM.VAL(LONGINT, SYSTEM.VAL(SET, tmp) - {0..27}) - 1023; (* clear leading sign bits, subtract bias *) tmp := arr[1]; tmp := SYSTEM.VAL(LONGINT, SYSTEM.VAL(SET, tmp) - {0..23} + {24..27}); arr2[0] := 63; arr2[1] := SHORT(SHORT(tmp)); (* positive, exponent = 0 (1023 - 1023), mantissa the same *) arr2[2] := arr[2]; arr2[3] := arr[3]; arr2[4] := arr[4]; arr2[5] := arr[5]; arr2[6] := arr[6]; arr2[7] := arr[7]; SYSTEM.MOVE(SYSTEM.ADR(arr2), SYSTEM.ADR(m), 8); WHILE (m >= 1) OR (m < 1) & (e MOD 4 # 0) DO m := m / 2; INC(e) END; e := e DIV 4; IF ODD(e) THEN p := (e DIV 2) + 1; q := 1 ELSE p := e DIV 2; q := 0 END; tmp := 1023 + 4 * p + 2 - 2 * q; arr2[0] := SHORT(SHORT(tmp DIV 16)); arr2[1] := SHORT(SHORT((tmp MOD 16) * 16)); arr2[2] := 0; arr2[3] := 0; arr2[4] := 0; arr2[5] := 0; arr2[6] := 0; arr2[7] := 0; SYSTEM.MOVE(SYSTEM.ADR(arr2), SYSTEM.ADR(y0), 8); (* y0 := 16^p * 4^-q *) y0 := y0 * 0.2202 * (m + 0.2587); y0 := 0.5 * (y0 + x / y0); y0 := 0.5 * (y0 + x / y0); y0 := y0 + 0.5 * (x / y0 - y0); y0 := y0 + 0.5 * (x / y0 - y0); RETURN y0 END sqrt; Syntax10.Scn.Fnt FoldElems Syntax10.Scn.Fnt 1. Write x = 16^p * 2^-q * m where p is the exponent, q is an integer, 0 <= q <= 3, and m is within the range 1/2 <= m < 1. 2. Define two constants, a and b (where a = base point and 2^-b = a), as follows: If 1/2 <= m < 1/sqrt(2), then a = 1/2 and b = 1. If 1/sqrt <= m < 1, then a = 1 and b = 0 3. Write z = (m - a)/(m + a). Then, m = a * (1 + z)/(1 - z) and abs(z) < 0.1716. 4. Now, x = 2^(4p-q-b) * (1 + z)/(1 - z), and ln(x) = (4p - q - b) * ln2 + ln((1 + z)/(1 - z)). 5. To obtain ln((1 + z)/(1 -z)), first compute w = 2z = (m - a)/(0.5m + 0.5a) (which is represented with slightly more significant digits than z itself) and apply an approximation of the following form: ln((1 + z)/(1 - z)) ~ w * (c0 + c1 * w^2 * (w^2 + c2 + c3/(w^2 + c4 + c5/(w^2 + c6)))). These coefficients were obtained by the minimax rational approximation of 1/2z * ln((1 + z)/(1 - z)) over the range 0 <= z^2 <= 0.02944 under the constraint that c0 shall be exactly 1.0. The maximum relative error of this approximation is less than 2^-60.55. 6. If the common logarithm is desired, then log(x) = log(e) * ln(x). VAR arr, arr2: ARRAY 8 OF SHORTINT; e, p, q, tmp: LONGINT; a, b, z, m, e1, e2, e3, e4, e5: LONGREAL; Algorithm, double precision, source: VS FORTRAN Language and Library Reference, Appendix D BEGIN SYSTEM.MOVE(SYSTEM.ADR(x), SYSTEM.ADR(arr), 8); tmp := ASH(arr[1], -4); (* shift high-byte to low-byte *) e := ASH(arr[0], 4) + SYSTEM.VAL(LONGINT, SYSTEM.VAL(SET, tmp) - {0..27}) - 1023; (* clear leading sign bits, subtract bias *) tmp := arr[1]; tmp := SYSTEM.VAL(LONGINT, SYSTEM.VAL(SET, tmp) - {0..23} + {24..27}); arr2[0] := 63; arr2[1] := SHORT(SHORT(tmp)); (* positive, exponent = 0 (1023 - 1023), mantissa the same *) arr2[2] := arr[2]; arr2[3] := arr[3]; arr2[4] := arr[4]; arr2[5] := arr[5]; arr2[6] := arr[6]; arr2[7] := arr[7]; SYSTEM.MOVE(SYSTEM.ADR(arr2), SYSTEM.ADR(m), 8); WHILE m < 0.5 DO m := m * 2; DEC(e) END; p := (e + 3) DIV 4; q := 4 * p - e; IF (0.5 <= m) & (m < oneOverSqrt2) THEN a := 0.5; b := 1 ELSE a :=1; b := 0 END; z := (m - a) / (m + a); m := a * (1 + z) / (1 - z); (* Mathematica: N[EconomizedRationalApproximation[Log[(1+x)/(1-x)], {x, {0, 0.2}, 5, 5}], 20] *) e1 := (-0.1D0 + z); e2 := (-0.1D0 + z) * e1; e3 := (-0.1D0 + z) * e2; e4 := (-0.1D0 + z) * e3; e5 := (-0.1D0 + z) * e4; RETURN (4 * p - q - b) * Ln2 + (0.1214915623326932 * e5 + 0.3575522028414829 * e4 - 1.405671421672976 * e3 - 1.017778931767947 * e2 + 1.90514956852507 * e1 + 0.1992525344438533) / (- 0.02387870693365428 * e5 + 0.2131120706630974 * e4 + 0.3320091133574482 * e3 - 1.025888641845158 * e2 - 0.5021932584965673 * e1 + 0.992932894287171); END ln; Syntax10.Scn.Fnt FoldElems Syntax10.Scn.Fnt 1. IF x < -180.218, then 0 is given as the answer via floating-point underflow. 2. Otherwise, divide x by ln2 and write x = (4a - b - c/16) * ln2 - r where a, b and c are integers, 0 <= b <= 3 and 0 <= c <=15, and the remainder r is within the range 0 <= r < 1/16*ln2. This reduction is carried out in an extra precision to ensure accuracy. Then exp(x) = 16^a * 2^-b * 2^(-c/16) * e^-r. 3. Compute e^-r by using a minimax polynomial approximation of degree 6 over the range 0 <= r <= 1/16 * ln2. In obtaining coefficients of this approximation, the minimax of relative errors was taken under the constraint that the constant term a0 shall be exactly 1. The relative error is less than 2^-56.87. 4. Multiply e^-r by 2^(-c/16). The 16 values of 2^(-c/16) for 0 <= c <= 15 are included in the subprogram. Then halve the result b times. 5. Finally, add the hexadecimal exponent of a to the characteristic of the answer. VAR a, b, c, e, tmp: LONGINT; minusR, tmpLR, res, e1, e2, e3, e4: LONGREAL; arr: ARRAY 8 OF SHORTINT; Algorithm for double precision, source: VS FORTRAN Language and Library Reference, Appendix D BEGIN IF x > 0 THEN a := ENTIER(x / FourLn2 + 1); (* after rearranging the formula: a = ((x+4)/ln2 + b + c/16)/4 = x/4 + r/(4ln2) + b/4 + c/16; r/(4ln2) < 1/64, b/4 < 1, c/64 < 1/4 *) tmpLR := - x / Ln2 + 4.D0 * a; b := ENTIER(tmpLR); (* after rearranging the formula: b = -(x+r)/Ln2 - c/16 + 4a = -x/Ln2 - r/Ln2 - c/16 + 4a; r/Ln2 < 1/16, c/16 < 1/16 *) c := ENTIER((tmpLR - b) * 16); (* after rearranging the forumla: c = ((-(x+r)/Ln2 + 4a - b)*16 = -x/Ln2 - r/Ln2 + 4a - b)*16; r/Ln2 < 1/16, b <=3 *) minusR := (b + c / 16.D0 - 4 * a) * Ln2 + x; e1 := 0.025D0 + minusR; e2 := e1 * e1; e3 := e2 * e1; e4 := e3 * e1; res := (e4 * 0.0005805416143026127 + e3 * 0.01161087764086548 + e2 * 0.1044980348337474 + e1 * 0.4876576773120971 + 0.97531535462419) / (e4 * 0.0005952380952380827 - e3 * 0.01190480840773792 + e2 * 0.1071434151801274 - e1 * 0.5000027901879113 + 1.000005580375827); res := res * H1[c]; SYSTEM.MOVE(SYSTEM.ADR(res), SYSTEM.ADR(arr), 8); tmp := ASH(arr[1], -4); (* arr[1] contains the lower 4 bits of the exponent in its higher 4 bits => shift high-byte to low-byte *) e := ASH(arr[0], 4) + SYSTEM.VAL(LONGINT, SYSTEM.VAL(SET, tmp) - {0..27}) + 4 * a - b; (* arr[0] contains the higher 7 bits of the exponent, the sign bit is 0 => shift arr[0], add the lower 4 bits of the exponent, where we first have to clear leading sign bits which resulted from ASH. Finally we add 4 * a and subtract b in order to multiply res by 16^a and 2^-b. *) tmp := arr[1]; tmp := SYSTEM.VAL(LONGINT, SYSTEM.VAL(SET, tmp) - {0..27}); (* only the 4 LSB *) arr[0] := SHORT(SHORT(e DIV 16)); arr[1] := SHORT(SHORT(((e MOD 16) * 16) + tmp)); (* set new exponent keeping the old mantissa *) SYSTEM.MOVE(SYSTEM.ADR(arr), SYSTEM.ADR(res), 8); RETURN res ELSIF x = 0 THEN RETURN 1 ELSIF x < -180.218 THEN RETURN 0 ELSE RETURN 1 / exp(-x) END END exp; Syntax10.Scn.Fnt BEGIN ATanC[0] := 0.D0; ATanC[1] := 0.4636476090008061165D0; ATanC[2] := 0.7853981633974483094D0; ATanC[3] := 0.98279372324732906714D0; ATanC[4] := 1.1071487177940905022D0; ATanC[5] := 1.1902899496825317322D0; ATanC[6] := 1.2490457723982544262D0; ATanC[7] := 1.2924966677897852673D0; ATanC[8] := 1.3258176636680324644D0; H1[0] := 1.D0; (* values of 2^(x/16) *) H1[1] := 0.95760328069857364694D0; H1[2] := 0.91700404320467123174D0; H1[3] := 0.87812608018664974156D0; H1[4] := 0.84089641525371454303D0; H1[5] := 0.80524516597462715409D0; H1[6] := 0.77110541270397041181D0; H1[7] := 0.73841307296974965569D0; H1[8] := 0.7071067811865475244D0; H1[9] := 0.67712777346844636415D0; H1[10] := 0.64841977732550483297D0; H1[11] := 0.6209289060367420243D0; H1[12] := 0.59460355750136053336D0; H1[13] := 0.56939431737834582685D0; H1[14] := 0.5452538663326288296D0; H1[15] := 0.52213689121370692016; MODULE MathL; Floating point format according to the IEEE standard IMPORT SYSTEM; CONST e- = 2.7182818284590452354D0; pi- = 3.14159265358979323846D0; local constants String- = POINTER TO ARRAY 64 OF CHAR; ecvt-: PROCEDURE (x: LONGREAL; n: INTEGER): String; ATanC: ARRAY 9 OF LONGREAL; H1: ARRAY 16 OF LONGREAL; PROCEDURE arctan* (x: LONGREAL): LONGREAL; PROCEDURE sin* (x: LONGREAL): LONGREAL; PROCEDURE cos* (x: LONGREAL): LONGREAL; PROCEDURE sqrt* (x: LONGREAL): LONGREAL; PROCEDURE ln* (x: LONGREAL): LONGREAL; PROCEDURE exp* (x: LONGREAL): LONGREAL; initialization of pseudo-constants END MathL.